AITopics | graph transformer model

Collaborating Authors

graph transformer model

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Efficient Sharpness-Aware Minimization for Molecular Graph Transformer Models

Wang, Yili, Zhou, Kaixiong, Liu, Ninghao, Wang, Ying, Wang, Xin

arXiv.org Artificial IntelligenceJun-18-2024

Sharpness-aware minimization (SAM) has received increasing attention in computer vision since it can effectively eliminate the sharp local minima from the training trajectory and mitigate generalization degradation. However, SAM requires two sequential gradient computations during the optimization of each step: one to obtain the perturbation gradient and the other to obtain the updating gradient. Compared with the base optimizer (e.g., Adam), SAM doubles the time overhead due to the additional perturbation gradient. By dissecting the theory of SAM and observing the training gradient of the molecular graph transformer, we propose a new algorithm named GraphSAM, which reduces the training cost of SAM and improves the generalization performance of graph transformer models. There are two key factors that contribute to this result: (i) \textit{gradient approximation}: we use the updating gradient of the previous step to approximate the perturbation gradient at the intermediate steps smoothly (\textbf{increases efficiency}); (ii) \textit{loss landscape approximation}: we theoretically prove that the loss landscape of GraphSAM is limited to a small range centered on the expected loss of SAM (\textbf{guarantees generalization performance}). The extensive experiments on six datasets with different tasks demonstrate the superiority of GraphSAM, especially in optimizing the model update process. The code is in:https://github.com/YL-wang/GraphSAM/tree/graphsam

generalization performance, graphsam, perturbation gradient, (11 more...)

arXiv.org Artificial Intelligence

2406.13137

Country:

Asia > China > Jilin Province (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

Revisiting Mobility Modeling with Graph: A Graph Transformer Model for Next Point-of-Interest Recommendation

Xu, Xiaohang, Suzumura, Toyotaro, Yong, Jiawei, Hanai, Masatoshi, Yang, Chuang, Kanezashi, Hiroki, Jiang, Renhe, Fukushima, Shintaro

arXiv.org Artificial IntelligenceOct-2-2023

Next Point-of-Interest (POI) recommendation plays a crucial role in urban mobility applications. Recently, POI recommendation models based on Graph Neural Networks (GNN) have been extensively studied and achieved, however, the effective incorporation of both spatial and temporal information into such GNN-based models remains challenging. Extracting distinct fine-grained features unique to each piece of information is difficult since temporal information often includes spatial information, as users tend to visit nearby POIs. To address the challenge, we propose \textbf{\underline{Mob}}ility \textbf{\underline{G}}raph \textbf{\underline{T}}ransformer (MobGT) that enables us to fully leverage graphs to capture both the spatial and temporal features in users' mobility patterns. MobGT combines individual spatial and temporal graph encoders to capture unique features and global user-location relations. Additionally, it incorporates a mobility encoder based on Graph Transformer to extract higher-order information between POIs. To address the long-tailed problem in spatial-temporal data, MobGT introduces a novel loss function, Tail Loss. Experimental results demonstrate that MobGT outperforms state-of-the-art models on various datasets and metrics, achieving 24\% improvement on average. Our codes are available at \url{https://github.com/Yukayo/MobGT}.

dataset, information, recommendation, (13 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3589132.3625644

2310.01224

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.17)
Europe > Germany > Hamburg (0.05)
North America > United States > Nevada (0.04)
(4 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.88)

Add feedback

Deep Prompt Tuning for Graph Transformers

Shirkavand, Reza, Huang, Heng

arXiv.org Artificial IntelligenceSep-18-2023

Graph transformers have gained popularity in various graph-based tasks by addressing challenges faced by traditional Graph Neural Networks. However, the quadratic complexity of self-attention operations and the extensive layering in graph transformer architectures present challenges when applying them to graph based prediction tasks. Fine-tuning, a common approach, is resource-intensive and requires storing multiple copies of large models. We propose a novel approach called deep graph prompt tuning as an alternative to fine-tuning for leveraging large graph transformer models in downstream graph based prediction tasks. Our method introduces trainable feature nodes to the graph and pre-pends task-specific tokens to the graph transformer, enhancing the model's expressive power. By freezing the pre-trained parameters and only updating the added tokens, our approach reduces the number of free parameters and eliminates the need for multiple model copies, making it suitable for small datasets and scalable to large graphs. Through extensive experiments on various-sized datasets, we demonstrate that deep graph prompt tuning achieves comparable or even superior performance to fine-tuning, despite utilizing significantly fewer task-specific parameters. Our contributions include the introduction of prompt tuning for graph transformers, its application to both graph transformers and message passing graph neural networks, improved efficiency and resource utilization, and compelling experimental results. This work brings attention to a promising approach to leverage pre-trained models in graph based prediction tasks and offers new opportunities for exploring and advancing graph representation learning.

dataset, graph transformer, transformer, (11 more...)

arXiv.org Artificial Intelligence

2309.10131

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > New York > New York County > New York City (0.05)
Europe > Austria (0.04)
(7 more...)

Genre:

Research Report > New Finding (0.68)
Research Report > Promising Solution (0.68)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Power Law Graph Transformer for Machine Translation and Representation Learning

Gokden, Burc

arXiv.org Artificial IntelligenceJun-27-2021

We present the Power Law Graph Transformer, a transformer model with well defined deductive and inductive tasks for prediction and representation learning. The deductive task learns the dataset level (global) and instance level (local) graph structures in terms of learnable power law distribution parameters. The inductive task outputs the prediction probabilities using the deductive task output, similar to a transductive model. We trained our model with Turkish-English and Portuguese-English datasets from TED talk transcripts for machine translation and compared the model performance and characteristics to a transformer model with scaled dot product attention trained on the same experimental setup. We report BLEU scores of $17.79$ and $28.33$ on the Turkish-English and Portuguese-English translation tasks with our model, respectively. We also show how a duality between a quantization set and N-dimensional manifold representation can be leveraged to transform between local and global deductive-inductive outputs using successive application of linear and non-linear transformations end-to-end.

attention stage, graph transformer model, transformer model, (13 more...)

arXiv.org Artificial Intelligence

2107.02039

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > Oregon (0.04)
(8 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback